Scour
🎮 Reinforcement Learning
Specific
RL, reward functions, policy gradient, RLHF
Scoured 184638 posts in 20.6 ms
Policy Improvement Reinforcement Learning
♠️ Game Theory · arxiv.org · 1d
How does Reinforcement Learning Affect Models
💬 LLMs · lesswrong.com · 3d
The Data Layer Tax for Robot Learning
Machine Learning · rerun.io · 1h · Hacker News
There Will Be a Scientific Theory of Deep Learning
AI · mail.bycloud.ai · 18h
How to build custom reasoning agents with a fraction of the compute
💬 LLMs · venturebeat.com · 1d
Adaptive home energy management to self-motivated user preferences via iterative LLM-augmented reinforcement learning
AI Agents · sciencedirect.com · 5d
Learning diverse natural behaviors for enhancing the agility of quadrupedal robots
AI Agents · nature.com · 1d
DEEP Robotics
Neural Networks · youtube.com · 2d · r/singularity
A new GitHub repo to detect reward hacking in RL models
AI Agents · github.com · 4d · Hacker News
Artificial Intelligence: Foundations of Computational Agents
AI Agents · artint.info · 2d · Hacker News
On-Policy vs Off-Policy RL: PPO vs SAC on 5 Gymnasium Tasks
AI Agents · tildalice.io · 3d
Show HN: A live autonomous economic network for AI agents
AI Agents · ainetwork-global.github.io · 2d · Hacker News
Jaxpot: Train self-play RL agents FAST by parallelizing environments on GPU
AI Agents · bardsai.substack.com · 2d · Substack
Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding
💬 LLMs · together.ai · 6d
The Policy Picks the Policy
AI Agents · noise2signal.bearblog.dev · 1d
Fixing What LLMs Get Wrong (22 minute read)
💬 LLMs · thebigdataguy.substack.com · 3d · Substack
Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning
💬 LLMs · arxiv.org · 9h
RL, in pictures and videos
AI Agents · suriya.cc · 5d
Jamie Simon and Daniel Kunin, UC Berkeley: There Will Be a Scientific Theory of Deep Learning (Podcast, April 24, 2026)
AI · imbue.com · 6d
Software Agents: The management challenge
AI Agents · hypecycles.com · 5d